Journals
  Publication Years
  Keywords
Search within results Open Search
Please wait a minute...
For Selected: Toggle Thumbnails
Task scheduling strategy based on data stream classification in Heron
ZHANG Yitian, YU Jiong, LU Liang, LI Ziyang
Journal of Computer Applications    2019, 39 (4): 1106-1116.   DOI: 10.11772/j.issn.1001-9081.2018081848
Abstract462)      PDF (1855KB)(330)       Save
In a new platform for big data stream processing called Heron, the round-robin scheduling algorithm is usually used for task scheduling by default, which does not consider the topology runtime state and the impact of different communication modes among task instances on Heron's performance. To solve this problem, a task scheduling strategy based on Data Stream Classification in Heron (DSC-Heron) was proposed, including data stream classification algorithm, data stream cluster allocation algorithm and data stream classification scheduling algorithm. Firstly, the instance allocation model of Heron was established to clarify the difference in communication overhead among different communication modes of the task instances. Secondly, the data stream was classified according to the real-time data stream size between task instances based on the data stream classification model of Heron. Finally, the packing plan of Heron was constructed by using the interrelated high-frequency data streams as the basic scheduling unit to complete the scheduling to minimize the communication cost by transforming inter-node data streams into intra-node ones as many as possible. After running SentenceWordCount, WordCount and FileWordCount topologies in a Heron cluster environment with 9 nodes, the results show that compared with the Heron default scheduling strategy, DSC-Heron has 8.35%, 7.07% and 6.83% improvements in system complete latency, inter-node communication overhead and system throughput respectively; in the load balancing aspect, the standard deviations of CPU usage and memory usage of the working nodes are decreased by 41.44% and 41.23% respectively. All experimental results show that DSC-Heron can effectively improve the performance of the topologies, and has the most significant optimization effect on FileWordCount topology which is close to the real application scenario.
Reference | Related Articles | Metrics